1,435 research outputs found
Steered mixture-of-experts for light field images and video : representation and coding
Research in light field (LF) processing has increased substantially over the last decade. This is largely driven by the desire to achieve the same level of immersion and navigational freedom for camera-captured scenes as is currently available for CGI content. Standardization organizations such as MPEG and JPEG continue to follow conventional coding paradigms in which viewpoints are discretely represented on 2-D regular grids. These grids are then further decorrelated through hybrid DPCM/transform techniques. However, these 2-D regular grids are less suited for high-dimensional data, such as LFs. We propose a novel coding framework for higher-dimensional image modalities, called Steered Mixture-of-Experts (SMoE). Coherent areas in the higher-dimensional space are represented by single higher-dimensional entities, called kernels. These kernels hold spatially localized information about light rays at any angle arriving at a certain region. The global model thus consists of a set of kernels which define a continuous approximation of the underlying plenoptic function. We introduce the theory of SMoE and illustrate its application for 2-D images, 4-D LF images, and 5-D LF video. We also propose an efficient coding strategy to convert the model parameters into a bitstream. Even without provisions for high-frequency information, the proposed method performs comparably to the state of the art for low-to-mid range bitrates with respect to subjective visual quality of 4-D LF images. In the case of 5-D LF video, we observe superior decorrelation and coding performance with coding gains of a factor of 4 in bitrate for the same quality. At least equally important is the fact that our method inherently has desired functionality for LF rendering which is lacking in other state-of-the-art techniques: (1) full zero-delay random access, (2) light-weight pixel-parallel view reconstruction, and (3) intrinsic view interpolation and super-resolution.
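The idea of kernels that together define a continuous approximation can be illustrated with a small 2-D sketch. The code below is not the authors' implementation; it assumes Gaussian kernels as gating functions and a linear expert per kernel, and all names (`smoe_reconstruct`, `centers`, `covs`, `slopes`, `offsets`) and parameter values are hypothetical choices made for illustration only.

```python
import numpy as np

def smoe_reconstruct(coords, centers, covs, slopes, offsets, priors=None):
    """Evaluate a gated mixture of linear experts at arbitrary coordinates.

    coords:  (N, D) query positions (assumed layout)
    centers: (K, D) kernel centers
    covs:    (K, D, D) kernel covariances (these "steer" each kernel)
    slopes:  (K, D) gradient of each linear expert
    offsets: (K,)   value of each expert at its kernel center
    """
    k = centers.shape[0]
    if priors is None:
        priors = np.full(k, 1.0 / k)
    diff = coords[:, None, :] - centers[None, :, :]          # (N, K, D)
    prec = np.linalg.inv(covs)                               # (K, D, D)
    maha = np.einsum("nkd,kde,nke->nk", diff, prec, diff)    # (N, K)
    log_norm = -0.5 * np.log(np.linalg.det(2 * np.pi * covs))
    log_w = np.log(priors) + log_norm - 0.5 * maha
    log_w -= log_w.max(axis=1, keepdims=True)                # numerical stability
    gates = np.exp(log_w)
    gates /= gates.sum(axis=1, keepdims=True)                # soft gating weights
    experts = offsets[None, :] + np.einsum("nkd,kd->nk", diff, slopes)
    return (gates * experts).sum(axis=1), gates

# Toy model (made-up values): a dark kernel on the left, a bright one on the right.
centers = np.array([[0.25, 0.5], [0.75, 0.5]])
covs = np.stack([0.01 * np.eye(2)] * 2)
slopes = np.zeros((2, 2))
offsets = np.array([0.2, 0.8])

# Each pixel is evaluated independently, so any subset of pixels can be
# reconstructed on its own, at any resolution.
xs, ys = np.meshgrid(np.linspace(0, 1, 8), np.linspace(0, 1, 8))
grid = np.stack([xs.ravel(), ys.ravel()], axis=1)
image, gates = smoe_reconstruct(grid, centers, covs, slopes, offsets)
image = image.reshape(8, 8)
```

Because the model is a function of continuous coordinates, sampling it on a denser grid or at in-between positions requires no change to the parameters, which is one way to read the abstract's claims about random access, pixel-parallel reconstruction, and interpolation.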
A multi-configuration part-based person detector
Proceedings of the Special Session on Multimodal Security and Surveillance Analytics 2014, held during the International Conference on Signal Processing and Multimedia Applications (SIGMAP 2014) in Vienna
People detection is a task that has generated great interest in the computer vision community, and especially in the surveillance
community. One of the main problems of this task in crowded scenarios is the high number of occlusions
caused by persons appearing in groups. In this paper, we address this problem by combining individual
body part detectors in a statistically driven way so that persons can be detected even when the detection of
some body parts fails, i.e., we propose a generic scheme to deal with partial occlusions. We demonstrate
the validity of our approach and compare it with other state-of-the-art approaches on several public datasets.
In our experiments we consider sequences of different complexity in terms of occupancy, and therefore
with different numbers of people present in the scene, in order to highlight the benefits and difficulties of the
approaches considered for evaluation. The results show that our approach improves on the results provided by
state-of-the-art approaches, especially in the case of crowded scenes.
This work has been done while visiting the Communication
Systems Group at the Technische Universität Berlin (Germany) under the supervision of Prof.
Dr.-Ing. Thomas Sikora. This work has been partially
supported by the Universidad Autónoma de Madrid
(“Programa propio de ayudas para estancias breves
en España y extranjero para Personal Docente e Investigador
en Formación de la UAM”), by the Spanish
Government (TEC2011-25995 EventVideo) and
by the European Community's FP7 under grant agreement
number 261776 (MOSAIC).
Fast structural changes (200–900 ns) may prepare the photosynthetic manganese complex for oxidation by the adjacent tyrosine radical
The Mn complex of photosystem II (PSII) cycles through 4 semi-stable states
(S0 to S3). Laser-flash excitation of PSII in the S2 or S3 state induces
processes with time constants around 350 ns, which have been assigned
previously to energetic relaxation of the oxidized tyrosine (YZox). Herein we
report monitoring of these processes in the time domain of hundreds of
nanoseconds by photoacoustic (or “optoacoustic”) experiments involving
pressure-wave detection after excitation of PSII membrane particles by
ns-laser flashes. We find that specifically for excitation of PSII in the S2
state, nuclear rearrangements are induced which amount to a contraction of
PSII by at least 30 Å³ (time constant of 350 ns at 25 °C; activation energy of
285 ± 50 meV). In the S3 state, the 350-ns contraction is about 5 times
smaller whereas in S0 and S1, no volume changes are detectable in this time
domain. It is proposed that the classical S2 => S3 transition of the Mn
complex is a multi-step process. The first step after YZox formation involves
a fast nuclear rearrangement of the Mn complex and its protein–water
environment (~350 ns), which may serve a dual role: (1) The Mn-complex
entity is prepared for the subsequent proton removal and electron transfer by
formation of an intermediate state of specific (but still unknown) atomic
structure. (2) Formation of the structural intermediate is associated
(necessarily) with energetic relaxation and thus stabilization of YZox so that
energy losses by charge recombination with the QA− anion radical are
minimized. The intermediate formed within about 350 ns after YZox formation in
the S2-state is discussed in the context of two recent models of the S2 => S3
transition of the water oxidation cycle. This article is part of a Special
Issue entitled: Photosynthesis Research for Sustainability: From Natural to
Artificial
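To give a feel for what the quoted activation energy means, the following sketch extrapolates the 350 ns time constant to other temperatures. It assumes simple Arrhenius behaviour, which the abstract does not state explicitly; the function name `time_constant_ns` and the choice of 10 °C as an example temperature are hypothetical.

```python
import math

K_B_EV = 8.617333262e-5  # Boltzmann constant in eV/K

def time_constant_ns(temp_c, tau_ref_ns=350.0, temp_ref_c=25.0, ea_ev=0.285):
    """Arrhenius extrapolation (an assumption, not taken from the abstract).

    tau(T) = tau_ref * exp(Ea / kB * (1/T - 1/T_ref)), temperatures in kelvin.
    Defaults are the values quoted in the abstract: 350 ns at 25 degC, 285 meV.
    """
    t = temp_c + 273.15
    t_ref = temp_ref_c + 273.15
    return tau_ref_ns * math.exp(ea_ev / K_B_EV * (1.0 / t - 1.0 / t_ref))

# Under this assumption, cooling the sample slows the process down.
tau_10c = time_constant_ns(10.0)
```

With the central value of 285 meV, cooling from 25 °C to 10 °C lengthens the time constant by a factor of a little under two; the stated uncertainty of 50 meV would shift that factor noticeably.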
Denoising OCT Images Using Steered Mixture of Experts with Multi-Model Inference
In Optical Coherence Tomography (OCT), speckle noise significantly hampers
image quality, affecting diagnostic accuracy. Current methods, including
traditional filtering and deep learning techniques, have limitations in noise
reduction and detail preservation. Addressing these challenges, this study
introduces a novel denoising algorithm, Block-Matching Steered-Mixture of
Experts with Multi-Model Inference and Autoencoder (BM-SMoE-AE). This method
combines block-matched implementation of the SMoE algorithm with an enhanced
autoencoder architecture, offering efficient speckle noise reduction while
retaining critical image details. Our method stands out by providing improved
edge definition and reduced processing time. Comparative analysis with existing
denoising techniques demonstrates the superior performance of BM-SMoE-AE in
maintaining image integrity and enhancing OCT image usability for medical
diagnostics.
Comment: This submission contains 10 pages and 4 figures. It was presented at
the 2024 SPIE Photonics West, held in San Francisco. The paper details
advancements in photonics applications related to healthcare and includes
supplementary material with additional datasets for revie
Blip10000: a social video dataset containing SPUG content for tagging and retrieval
The increasing amount of digital multimedia content available is inspiring potential new types of user interaction with video data. Users want to easily find content by searching and browsing. For this reason, techniques are needed that allow automatic categorisation, searching of the content, and linking to related information.
In this work, we present a dataset that contains comprehensive semi-professional user-generated (SPUG) content, including audiovisual content, user-contributed metadata, automatic speech recognition transcripts, automatic shot boundary files, and social information for multiple `social levels'. We describe the principal characteristics of this dataset and present results that have been achieved on different tasks.
- …